{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Session 4: Visualizing Representations\n",
    "\n",
    "## Assignment: Deep Dream and Style Net\n",
    "\n",
    "<p class='lead'>\n",
    "Creative Applications of Deep Learning with Google's Tensorflow  \n",
    "Parag K. Mital  \n",
    "Kadenze, Inc.\n",
    "</p>\n",
    "\n",
    "# Overview\n",
    "\n",
    "In this homework, we'll first walk through visualizing the gradients of a trained convolutional network.  Recall from the last session that we had trained a variational convolutional autoencoder.  We also trained a deep convolutional network.  In both of these networks, we learned only a few tools for understanding how the model performs.  These included measuring the loss of the network and visualizing the `W` weight matrices and/or convolutional filters of the network.\n",
    "\n",
    "During the lecture we saw how to visualize the gradients of Inception, Google's state of the art network for object recognition.  This resulted in a much more powerful technique for understanding how a network's activations transform or accentuate the representations in the input space.  We'll explore this more in Part 1.\n",
    "\n",
    "We also explored how to use the gradients of a particular layer or neuron within a network with respect to its input for performing \"gradient ascent\". This resulted in Deep Dream.  We'll explore this more in Parts 2-4.\n",
    "\n",
    "We also saw how the gradients at different layers of a convolutional network could be optimized for another image, resulting in the separation of content and style losses, depending on the chosen layers.  This allowed us to synthesize new images that shared another image's content and/or style, even if they came from separate images.  We'll explore this more in Part 5.\n",
    "\n",
    "Finally, you'll packaged all the GIFs you create throughout this notebook and upload them to Kadenze.\n",
    "\n",
    "\n",
    "<a name=\"learning-goals\"></a>\n",
    "# Learning Goals\n",
    "\n",
    "* Learn how to inspect deep networks by visualizing their gradients\n",
    "* Learn how to \"deep dream\" with different objective functions and regularization techniques\n",
    "* Learn how to \"stylize\" an image using content and style losses from different images\n",
    "\n",
    "\n",
    "# Table of Contents\n",
    "\n",
    "<!-- MarkdownTOC autolink=true autoanchor=true bracket=round -->\n",
    "\n",
    "- [Part 1 - Pretrained Networks](#part-1---pretrained-networks)\n",
    "  - [Graph Definition](#graph-definition)\n",
    "  - [Preprocess/Deprocessing](#preprocessdeprocessing)\n",
    "  - [Tensorboard](#tensorboard)\n",
    "  - [A Note on 1x1 Convolutions](#a-note-on-1x1-convolutions)\n",
    "  - [Network Labels](#network-labels)\n",
    "  - [Using Context Managers](#using-context-managers)\n",
    "- [Part 2 - Visualizing Gradients](#part-2---visualizing-gradients)\n",
    "- [Part 3 - Basic Deep Dream](#part-3---basic-deep-dream)\n",
    "- [Part 4 - Deep Dream Extensions](#part-4---deep-dream-extensions)\n",
    "  - [Using the Softmax Layer](#using-the-softmax-layer)\n",
    "  - [Fractal](#fractal)\n",
    "  - [Guided Hallucinations](#guided-hallucinations)\n",
    "  - [Further Explorations](#further-explorations)\n",
    "- [Part 5 - Style Net](#part-5---style-net)\n",
    "  - [Network](#network)\n",
    "  - [Content Features](#content-features)\n",
    "  - [Style Features](#style-features)\n",
    "  - [Remapping the Input](#remapping-the-input)\n",
    "  - [Content Loss](#content-loss)\n",
    "  - [Style Loss](#style-loss)\n",
    "  - [Total Variation Loss](#total-variation-loss)\n",
    "  - [Training](#training)\n",
    "- [Assignment Submission](#assignment-submission)\n",
    "\n",
    "<!-- /MarkdownTOC -->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# First check the Python version\n",
    "import sys\n",
    "if sys.version_info < (3,4):\n",
    "    print('You are running an older version of Python!\\n\\n',\n",
    "          'You should consider updating to Python 3.4.0 or',\n",
    "          'higher as the libraries built for this course',\n",
    "          'have only been tested in Python 3.4 and higher.\\n')\n",
    "    print('Try installing the Python 3.5 version of anaconda'\n",
    "          'and then restart `jupyter notebook`:\\n',\n",
    "          'https://www.continuum.io/downloads\\n\\n')\n",
    "\n",
    "# Now get necessary libraries\n",
    "try:\n",
    "    import os\n",
    "    import numpy as np\n",
    "    import matplotlib.pyplot as plt\n",
    "    from skimage.transform import resize\n",
    "    from skimage import data\n",
    "    from scipy.misc import imresize\n",
    "    from scipy.ndimage.filters import gaussian_filter\n",
    "    import IPython.display as ipyd\n",
    "    import tensorflow as tf\n",
    "    from libs import utils, gif, datasets, dataset_utils, vae, dft, vgg16, nb_utils\n",
    "except ImportError:\n",
    "    print(\"Make sure you have started notebook in the same directory\",\n",
    "          \"as the provided zip file which includes the 'libs' folder\",\n",
    "          \"and the file 'utils.py' inside of it.  You will NOT be able\",\n",
    "          \"to complete this assignment unless you restart jupyter\",\n",
    "          \"notebook inside the directory created by extracting\",\n",
    "          \"the zip file or cloning the github repo.  If you are still\")\n",
    "\n",
    "# We'll tell matplotlib to inline any drawn figures like so:\n",
    "%matplotlib inline\n",
    "plt.style.use('ggplot')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Bit of formatting because I don't like the default inline code style:\n",
    "from IPython.core.display import HTML\n",
    "HTML(\"\"\"<style> .rendered_html code { \n",
    "    padding: 2px 4px;\n",
    "    color: #c7254e;\n",
    "    background-color: #f9f2f4;\n",
    "    border-radius: 4px;\n",
    "} </style>\"\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"part-1---pretrained-networks\"></a>\n",
    "# Part 1 - Pretrained Networks\n",
    "\n",
    "In the libs module, you'll see that I've included a few modules for loading some state of the art networks.  These include:\n",
    "\n",
    "* [Inception v3](https://github.com/tensorflow/models/tree/master/inception)\n",
    "    - This network has been trained on ImageNet and its finaly output layer is a softmax layer denoting 1 of 1000 possible objects (+ 8 for unknown categories).  This network is about only 50MB!\n",
    "* [Inception v5](https://github.com/tensorflow/models/tree/master/inception)\n",
    "    - This network has been trained on ImageNet and its finaly output layer is a softmax layer denoting 1 of 1000 possible objects (+ 8 for unknown categories).  This network is about only 50MB!  It presents a few extensions to v5 which are not documented anywhere that I've found, as of yet...\n",
    "* [Visual Group Geometry @ Oxford's 16 layer](http://www.robots.ox.ac.uk/~vgg/research/very_deep/)\n",
    "    - This network has been trained on ImageNet and its finaly output layer is a softmax layer denoting 1 of 1000 possible objects.  This model is nearly half a gigabyte, about 10x larger in size than the inception network.  The trade off is that it is very fast.\n",
    "* [Visual Group Geometry @ Oxford's Face Recognition](http://www.robots.ox.ac.uk/~vgg/software/vgg_face/)\n",
    "    - This network has been trained on the VGG Face Dataset and its final output layer is a softmax layer denoting 1 of 2622 different possible people.\n",
    "* [Illustration2Vec](http://illustration2vec.net)\n",
    "    - This network has been trained on illustrations and manga and its final output layer is 4096 features.\n",
    "* [Illustration2Vec Tag](http://illustration2vec.net)\n",
    "    - Please do not use this network if you are under the age of 18 (seriously!)\n",
    "    - This network has been trained on manga and its final output layer is one of 1539 labels.\n",
    "\n",
    "When we use a pre-trained network, we load a network's definition and its weights which have already been trained.  The network's definition includes a set of operations such as convolutions, and adding biases, but all of their values, i.e. the weights, have already been trained.  \n",
    "\n",
    "<a name=\"graph-definition\"></a>\n",
    "## Graph Definition\n",
    "\n",
    "In the libs folder, you will see a few new modules for loading the above pre-trained networks. Each module is structured similarly to help you understand how they are loaded and include example code for using them.  Each module includes a `preprocess` function for using before sending the image to the network.  And when using deep dream techniques, we'll be using the `deprocess` function to undo the `preprocess` function's manipulations.\n",
    "\n",
    "Let's take a look at loading one of these.  Every network except for `i2v` includes a key 'labels' denoting what labels the network has been trained on.  If you are under the age of 18, please do not use the `i2v_tag model`, as its labels are unsuitable for minors.\n",
    "\n",
    "Let's load the libaries for the different pre-trained networks:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from libs import vgg16, inception, i2v"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can load a pre-trained network's graph and any labels.  Explore the different networks in your own time.\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Stick w/ Inception for now, and then after you see how\n",
    "# the next few sections work w/ this network, come back\n",
    "# and explore the other networks.\n",
    "\n",
    "net = inception.get_inception_model(version='v5')\n",
    "# net = inception.get_inception_model(version='v3')\n",
    "# net = vgg16.get_vgg_model()\n",
    "# net = vgg16.get_vgg_face_model()\n",
    "# net = i2v.get_i2v_model()\n",
    "# net = i2v.get_i2v_tag_model()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each network returns a dictionary with the following keys defined.  Every network has a key for \"labels\" except for \"i2v\", since this is a feature only network, e.g. an unsupervised network, and does not have labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "print(net.keys())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"preprocessdeprocessing\"></a>\n",
    "## Preprocess/Deprocessing\n",
    "\n",
    "Each network has a preprocessing/deprocessing function which we'll use before sending the input to the network.  This preprocessing function is slightly different for each network.  Recall from the previous sessions what preprocess we had done before sending an image to a network.  We would often normalize the input by subtracting the mean and dividing by the standard deviation.  We'd also crop/resize the input to a standard size.  We'll need to do this for each network except for the Inception network, which is a true convolutional network and does not require us to do this (will be explained in more depth later). \n",
    "\n",
    "Whenever we `preprocess` the image, and want to visualize the result of adding back the gradient to the input image (when we use deep dream), we'll need to use the `deprocess` function stored in the dictionary.  Let's explore how these work.  We'll confirm this is performing the inverse operation, let's try to preprocess the image, then I'll have you try to deprocess it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# First, let's get an image:\n",
    "og = plt.imread('clinton.png')[..., :3]\n",
    "plt.imshow(og)\n",
    "print(og.min(), og.max())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now try preprocessing this image.  The function for preprocessing is inside the module we used to load it.  For instance, for `vgg16`, we can find the `preprocess` function as `vgg16.preprocess`, or for `inception`, `inception.preprocess`, or for `i2v`, `i2v.preprocess`.  Or, we can just use the key `preprocess` in our dictionary `net`, as this is just convenience for us to access the corresponding preprocess function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Now call the preprocess function.  This will preprocess our\n",
    "# image ready for being input to the network, except for changes\n",
    "# to the dimensions.  I.e., we will still need to convert this\n",
    "# to a 4-dimensional Tensor once we input it to the network.\n",
    "# We'll see how that works later.\n",
    "img = net['preprocess'](og)\n",
    "print(img.min(), img.max())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's undo the preprocessing.  Recall that the `net` dictionary has the key `deprocess` which is the function we need to use on our processed image, `img`.\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "deprocessed = ...\n",
    "plt.imshow(deprocessed)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"tensorboard\"></a>\n",
    "## Tensorboard\n",
    "\n",
    "I've added a utility module called `nb_utils` which includes a function `show_graph`.  This will use [Tensorboard](https://www.tensorflow.org/versions/r0.10/how_tos/graph_viz/index.html) to draw the computational graph defined by the various Tensorflow functions.  I didn't go over this during the lecture because there just wasn't enough time!  But explore it in your own time if it interests you, as it is a really unique tool which allows you to monitor your network's training progress via a web interface.  It even lets you monitor specific variables or processes within the network, e.g. the reconstruction of an autoencoder, without having to print to the console as we've been doing.  We'll just be using it to draw the pretrained network's graphs using the utility function I've given you.\n",
    "\n",
    "Be sure to interact with the graph and click on the various modules.\n",
    "\n",
    "For instance, if you've loaded the `inception` v5 network, locate the \"input\" to the network.  This is where we feed the image, the input placeholder (typically what we've been denoting as `X` in our own networks).  From there, it goes to the \"conv2d0\" variable scope (i.e. this uses the code: `with tf.variable_scope(\"conv2d0\")` to create a set of operations with the prefix \"conv2d0/\".  If you expand this scope, you'll see another scope, \"pre_relu\".  This is created using another `tf.variable_scope(\"pre_relu\")`, so that any new variables will have the prefix \"conv2d0/pre_relu\".  Finally, inside here, you'll see the convolution operation (`tf.nn.conv2d`) and the 4d weight tensor, \"w\" (e.g. created using `tf.get_variable`), used for convolution (and so has the name, \"conv2d0/pre_relu/w\".  Just after the convolution is the addition of the bias, b.  And finally after exiting the \"pre_relu\" scope, you should be able to see the \"conv2d0\" operation which applies the relu nonlinearity.  In summary, that region of the graph can be created in Tensorflow like so:\n",
    "\n",
    "```python\n",
    "input = tf.placeholder(...)\n",
    "with tf.variable_scope('conv2d0'):\n",
    "    with tf.variable_scope('pre_relu'):\n",
    "        w = tf.get_variable(...)\n",
    "        h = tf.nn.conv2d(input, h, ...)\n",
    "        b = tf.get_variable(...)\n",
    "        h = tf.nn.bias_add(h, b)\n",
    "    h = tf.nn.relu(h)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "nb_utils.show_graph(net['graph_def'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you open up the \"mixed3a\" node above (double click on it), you'll see the first \"inception\" module.  This network encompasses a few advanced concepts that we did not have time to discuss during the lecture, including residual connections, feature concatenation, parallel convolution streams, 1x1 convolutions, and including negative labels in the softmax layer.  I'll expand on the 1x1 convolutions here, but please feel free to skip ahead if this isn't of interest to you.\n",
    "\n",
    "<a name=\"a-note-on-1x1-convolutions\"></a>\n",
    "## A Note on 1x1 Convolutions\n",
    "\n",
    "The 1x1 convolutions are setting the `ksize` parameter of the kernels to 1.  This is effectively allowing you to change the number of dimensions.  Remember that you need a 4-d tensor as input to a convolution.  Let's say its dimensions are $\\text{N}\\ x\\ \\text{H}\\ x\\ \\text{W}\\ x\\ \\text{C}_I$, where $\\text{C}_I$ represents the number of channels the image has.  Let's say it is an RGB image, then $\\text{C}_I$ would be 3.  Or later in the network, if we have already convolved it, it might be 64 channels instead.  Regardless, when you convolve it w/ a $\\text{K}_H\\ x\\ \\text{K}_W\\ x\\ \\text{C}_I\\ x\\ \\text{C}_O$ filter, where $\\text{K}_H$ is 1 and $\\text{K}_W$ is also 1, then the filters size is: $1\\ x\\ 1\\ x\\ \\text{C}_I$ and this is perfomed for each output channel $\\text{C}_O$.  What this is doing is filtering the information only in the channels dimension, not the spatial dimensions.  The output of this convolution will be a $\\text{N}\\ x\\ \\text{H}\\ x\\ \\text{W}\\ x\\ \\text{C}_O$ output tensor.  The only thing that changes in the output is the number of output filters.\n",
    "\n",
    "The 1x1 convolution operation is essentially reducing the amount of information in the channels dimensions before performing a much more expensive operation, e.g. a 3x3 or 5x5 convolution.  Effectively, it is a very clever trick for dimensionality reduction used in many state of the art convolutional networks.  Another way to look at it is that it is preserving the spatial information, but at each location, there is a fully connected network taking all the information from every input channel, $\\text{C}_I$, and reducing it down to $\\text{C}_O$ channels (or could easily also be up, but that is not the typical use case for this).  So it's not really a convolution, but we can use the convolution operation to perform it at every location in our image.\n",
    "\n",
    "If you are interested in reading more about this architecture, I highly encourage you to read [Network in Network](https://arxiv.org/pdf/1312.4400v3.pdf), Christian Szegedy's work on the [Inception network](http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf), Highway Networks, Residual Networks, and Ladder Networks.\n",
    "\n",
    "In this course, we'll stick to focusing on the applications of these, while trying to delve as much into the code as possible.\n",
    "\n",
    "<a name=\"network-labels\"></a>\n",
    "## Network Labels\n",
    "\n",
    "Let's now look at the labels:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "net['labels']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "label_i = 851\n",
    "print(net['labels'][label_i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"using-context-managers\"></a>\n",
    "## Using Context Managers\n",
    "\n",
    "Up until now, we've mostly used a single `tf.Session` within a notebook and didn't give it much thought.  Now that we're using some bigger models, we're going to have to be more careful.  Using a big model and being careless with our session can result in a lot of unexpected behavior, program crashes, and out of memory errors.  The VGG network and the I2V networks are quite large.  So we'll need to start being more careful with our sessions using context managers.\n",
    "\n",
    "Let's see how this works w/ VGG:\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Load the VGG network.  Scroll back up to where we loaded the inception\n",
    "# network if you are unsure.  It is inside the \"vgg16\" module...\n",
    "net = ..\n",
    "\n",
    "assert(net['labels'][0] == (0, 'n01440764 tench, Tinca tinca'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Let's explicity use the CPU, since we don't gain anything using the GPU\n",
    "# when doing Deep Dream (it's only a single image, benefits come w/ many images).\n",
    "device = '/cpu:0'\n",
    "\n",
    "# We'll now explicitly create a graph\n",
    "g = tf.Graph()\n",
    "\n",
    "# And here is a context manager.  We use the python \"with\" notation to create a context\n",
    "# and create a session that only exists within this indent,  as soon as we leave it,\n",
    "# the session is automatically closed!  We also tell the session which graph to use.\n",
    "# We can pass a second context after the comma,\n",
    "# which we'll use to be explicit about using the CPU instead of a GPU.\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    \n",
    "    # Now load the graph_def, which defines operations and their values into `g`\n",
    "    tf.import_graph_def(net['graph_def'], name='net')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Now we can get all the operations that belong to the graph `g`:\n",
    "names = [op.name for op in g.get_operations()]\n",
    "print(names)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"part-2---visualizing-gradients\"></a>\n",
    "# Part 2 - Visualizing Gradients\n",
    "\n",
    "Now that we know how to load a network and extract layers from it, let's grab only the pooling layers:\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# First find all the pooling layers in the network.  You can\n",
    "# use list comprehension to iterate over all the \"names\" we just\n",
    "# created, finding whichever ones have the name \"pool\" in them.\n",
    "# Then be sure to append a \":0\" to the names\n",
    "features = ...\n",
    "\n",
    "# Let's print them\n",
    "print(features)\n",
    "\n",
    "# This is what we want to have at the end.  You could just copy this list\n",
    "# if you are stuck!\n",
    "assert(features == ['net/pool1:0', 'net/pool2:0', 'net/pool3:0', 'net/pool4:0', 'net/pool5:0'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's also grab the input layer:\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Use the function 'get_tensor_by_name' and the 'names' array to help you\n",
    "# get the first tensor in the network.  Remember you have to add \":0\" to the\n",
    "# name to get the output of an operation which is the tensor.\n",
    "x = ...\n",
    "\n",
    "assert(x.name == 'net/images:0')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll now try to find the gradient activation that maximizes a layer with respect to the input layer `x`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def plot_gradient(img, x, feature, g, device='/cpu:0'):\n",
    "    \"\"\"Let's visualize the network's gradient activation\n",
    "    when backpropagated to the original input image.  This\n",
    "    is effectively telling us which pixels contribute to the\n",
    "    predicted layer, class, or given neuron with the layer\"\"\"\n",
    "    \n",
    "    # We'll be explicit about the graph and the device\n",
    "    # by using a context manager:\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        saliency = tf.gradients(tf.reduce_mean(feature), x)\n",
    "        this_res = sess.run(saliency[0], feed_dict={x: img})\n",
    "        grad = this_res[0] / np.max(np.abs(this_res))\n",
    "        return grad"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try this w/ an image now.  We're going to use the `plot_gradient` function to help us.  This is going to take our input image, run it through the network up to a layer, find the gradient of the mean of that layer's activation with respect to the input image, then backprop that gradient back to the input layer.  We'll then visualize the gradient by normalizing its values using the `utils.normalize` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "og = plt.imread('clinton.png')[..., :3]\n",
    "img = net['preprocess'](og)[np.newaxis]\n",
    "\n",
    "fig, axs = plt.subplots(1, len(features), figsize=(20, 10))\n",
    "\n",
    "for i in range(len(features)):\n",
    "    axs[i].set_title(features[i])\n",
    "    grad = plot_gradient(img, x, g.get_tensor_by_name(features[i]), g)\n",
    "    axs[i].imshow(utils.normalize(grad))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"part-3---basic-deep-dream\"></a>\n",
    "# Part 3 - Basic Deep Dream\n",
    "\n",
    "In the lecture we saw how Deep Dreaming takes the backpropagated gradient activations and simply adds it to the image, running the same process again and again in a loop.  We also saw many tricks one can add to this idea, such as infinitely zooming into the image by cropping and scaling, adding jitter by randomly moving the image around, or adding constraints on the total activations.\n",
    "\n",
    "Have a look here for inspiration:\n",
    "\n",
    "https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html  \n",
    "\n",
    "\n",
    "https://photos.google.com/share/AF1QipPX0SCl7OzWilt9LnuQliattX4OUCj_8EP65_cTVnBmS1jnYgsGQAieQUc1VQWdgQ?key=aVBxWjhwSzg2RjJWLWRuVFBBZEN1d205bUdEMnhB  \n",
    "\n",
    "https://mtyka.github.io/deepdream/2016/02/05/bilateral-class-vis.html\n",
    "\n",
    "Let's stick the necessary bits in a function and try exploring how deep dream amplifies the representations of the chosen layers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def dream(img, gradient, step, net, x, n_iterations=50, plot_step=10):\n",
    "    # Copy the input image as we'll add the gradient to it in a loop\n",
    "    img_copy = img.copy()\n",
    "\n",
    "    fig, axs = plt.subplots(1, n_iterations // plot_step, figsize=(20, 10))\n",
    "\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        for it_i in range(n_iterations):\n",
    "\n",
    "            # This will calculate the gradient of the layer we chose with respect to the input image.\n",
    "            this_res = sess.run(gradient[0], feed_dict={x: img_copy})[0]\n",
    "\n",
    "            # Let's normalize it by the maximum activation\n",
    "            this_res /= (np.max(np.abs(this_res) + 1e-8))\n",
    "            \n",
    "            # Or alternatively, we can normalize by standard deviation\n",
    "            # this_res /= (np.std(this_res) + 1e-8)\n",
    "            \n",
    "            # Or we could use the `utils.normalize function:\n",
    "            # this_res = utils.normalize(this_res)\n",
    "            \n",
    "            # Experiment with all of the above options.  They will drastically\n",
    "            # effect the resulting dream, and really depend on the network\n",
    "            # you use, and the way the network handles normalization of the\n",
    "            # input image, and the step size you choose!  Lots to explore!\n",
    "\n",
    "            # Then add the gradient back to the input image\n",
    "            # Think about what this gradient represents?\n",
    "            # It says what direction we should move our input\n",
    "            # in order to meet our objective stored in \"gradient\"\n",
    "            img_copy += this_res * step\n",
    "\n",
    "            # Plot the image\n",
    "            if (it_i + 1) % plot_step == 0:\n",
    "                m = net['deprocess'](img_copy[0])\n",
    "                axs[it_i // plot_step].imshow(m)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# We'll run it for 3 iterations\n",
    "n_iterations = 3\n",
    "\n",
    "# Think of this as our learning rate.  This is how much of\n",
    "# the gradient we'll add back to the input image\n",
    "step = 1.0\n",
    "\n",
    "# Every 1 iterations, we'll plot the current deep dream\n",
    "plot_step = 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now try running Deep Dream for every feature, each of our 5 pooling layers.  We'll need to get the layer corresponding to our feature.  Then find the gradient of this layer's mean activation with respect to our input, `x`.  Then pass these to our `dream` function.  This can take awhile (about 10 minutes using the CPU on my Macbook Pro)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "for feature_i in range(len(features)):\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        # Get a feature layer\n",
    "        layer = g.get_tensor_by_name(features[feature_i])\n",
    "\n",
    "        # Find the gradient of this layer's mean activation\n",
    "        # with respect to the input image\n",
    "        gradient = tf.gradients(tf.reduce_mean(layer), x)\n",
    "        \n",
    "        # Dream w/ our image\n",
    "        dream(img, gradient, step, net, x, n_iterations=n_iterations, plot_step=plot_step)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of using an image, we can use an image of noise and see how it \"hallucinates\" the representations that the layer most responds to:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "noise = net['preprocess'](\n",
    "    np.random.rand(256, 256, 3) * 0.1 + 0.45)[np.newaxis]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll do the same thing as before, now w/ our noise image:\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "for feature_i in range(len(features)):\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        # Get a feature layer\n",
    "        layer = ...\n",
    "\n",
    "        # Find the gradient of this layer's mean activation\n",
    "        # with respect to the input image\n",
    "        gradient = ...\n",
    "        \n",
    "        # Dream w/ the noise image.  Complete this!\n",
    "        dream(...)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"part-4---deep-dream-extensions\"></a>\n",
    "# Part 4 - Deep Dream Extensions\n",
    "\n",
    "As we saw in the lecture, we can also use the final softmax layer of a network to use during deep dream.  This allows us to be explicit about the object we want hallucinated in an image.\n",
    "\n",
    "<a name=\"using-the-softmax-layer\"></a>\n",
    "## Using the Softmax Layer\n",
    "\n",
    "Let's get another image to play with, preprocess it, and then make it 4-dimensional.\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Load your own image here\n",
    "og = ...\n",
    "plt.imshow(og)\n",
    "\n",
    "# Preprocess the image and make sure it is 4-dimensional by adding a new axis to the 0th dimension:\n",
    "img = ...\n",
    "\n",
    "assert(img.ndim == 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Let's get the softmax layer\n",
    "print(names[-2])\n",
    "layer = g.get_tensor_by_name(names[-2] + \":0\")\n",
    "\n",
    "# And find its shape\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    layer_shape = tf.shape(layer).eval(feed_dict={x:img})\n",
    "\n",
    "# We can find out how many neurons it has by feeding it an image and\n",
    "# calculating the shape.  The number of output channels is the last dimension.\n",
    "n_els = layer_shape[-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Let's pick a label.  First let's print out every label and then find one we like:\n",
    "print(net['labels'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Pick a neuron.  Or pick a random one.  This should be 0-n_els\n",
    "neuron_i = ...\n",
    "\n",
    "print(net['labels'][neuron_i])\n",
    "assert(neuron_i >= 0 and neuron_i < n_els)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# And we'll create an activation of this layer which is very close to 0\n",
    "layer_vec = np.ones(layer_shape) / 100.0\n",
    "\n",
    "# Except for the randomly chosen neuron which will be very close to 1\n",
    "layer_vec[..., neuron_i] = 0.99"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's decide on some parameters of our deep dream.  We'll need to decide how many iterations to run for.  And we'll plot the result every few iterations, also saving it so that we can produce a GIF.  And at every iteration, we need to decide how much to ascend our gradient.\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Explore different parameters for this section.\n",
    "n_iterations = 51\n",
    "\n",
    "plot_step = 5\n",
    "\n",
    "# If you use a different network, you will definitely need to experiment\n",
    "# with the step size, as each network normalizes the input image differently.\n",
    "step = 0.2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's dream.  We're going to define a context manager to create a session and use our existing graph, and make sure we use the CPU device, as there is no gain in using GPU, and we have much more CPU memory than GPU memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "imgs = []\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    gradient = tf.gradients(tf.reduce_max(layer), x)\n",
    "\n",
    "    # Copy the input image as we'll add the gradient to it in a loop\n",
    "    img_copy = img.copy()\n",
    "\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        for it_i in range(n_iterations):\n",
    "\n",
    "            # This will calculate the gradient of the layer we chose with respect to the input image.\n",
    "            this_res = sess.run(gradient[0], feed_dict={\n",
    "                    x: img_copy, layer: layer_vec})[0]\n",
    "            \n",
    "            # Let's normalize it by the maximum activation\n",
    "            this_res /= (np.max(np.abs(this_res) + 1e-8))\n",
    "            \n",
    "            # Or alternatively, we can normalize by standard deviation\n",
    "            # this_res /= (np.std(this_res) + 1e-8)\n",
    "\n",
    "            # Then add the gradient back to the input image\n",
    "            # Think about what this gradient represents?\n",
    "            # It says what direction we should move our input\n",
    "            # in order to meet our objective stored in \"gradient\"\n",
    "            img_copy += this_res * step\n",
    "\n",
    "            # Plot the image\n",
    "            if (it_i + 1) % plot_step == 0:\n",
    "                m = net['deprocess'](img_copy[0])\n",
    "\n",
    "                plt.figure(figsize=(5, 5))\n",
    "                plt.grid('off')\n",
    "                plt.imshow(m)\n",
    "                plt.show()\n",
    "                \n",
    "                imgs.append(m)\n",
    "                "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Save the gif\n",
    "gif.build_gif(imgs, saveto='softmax.gif')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ipyd.Image(url='softmax.gif?i={}'.format(\n",
    "        np.random.rand()), height=300, width=300)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"fractal\"></a>\n",
    "## Fractal\n",
    "\n",
    "During the lecture we also saw a simple trick for creating an infinite fractal: crop the image and then resize it.  This can produce some lovely aesthetics and really show some strong object hallucinations if left long enough and with the right parameters for step size/normalization/regularization.  Feel free to experiment with the code below, adding your own regularizations as shown in the lecture to produce different results!\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n_iterations = 101\n",
    "plot_step = 10\n",
    "step = 0.1\n",
    "crop = 1\n",
    "imgs = []\n",
    "\n",
    "n_imgs, height, width, *ch = img.shape\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    # Explore changing the gradient here from max to mean\n",
    "    # or even try using different concepts we learned about\n",
    "    # when creating style net, such as using a total variational\n",
    "    # loss on `x`.\n",
    "    gradient = tf.gradients(tf.reduce_max(layer), x)\n",
    "\n",
    "    # Copy the input image as we'll add the gradient to it in a loop\n",
    "    img_copy = img.copy()\n",
    "\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        for it_i in range(n_iterations):\n",
    "\n",
    "            # This will calculate the gradient of the layer\n",
    "            # we chose with respect to the input image.\n",
    "            this_res = sess.run(gradient[0], feed_dict={\n",
    "                    x: img_copy, layer: layer_vec})[0]\n",
    "            \n",
    "            # This is just one way we could normalize the\n",
    "            # gradient.  It helps to look at the range of your image's\n",
    "            # values, e.g. if it is 0 - 1, or -115 to +115,\n",
    "            # and then consider the best way to normalize the gradient.\n",
    "            # For some networks, it might not even be necessary\n",
    "            # to perform this normalization, especially if you\n",
    "            # leave the dream to run for enough iterations.\n",
    "            # this_res = this_res / (np.std(this_res) + 1e-10)\n",
    "            this_res = this_res / (np.max(np.abs(this_res)) + 1e-10)\n",
    "\n",
    "            # Then add the gradient back to the input image\n",
    "            # Think about what this gradient represents?\n",
    "            # It says what direction we should move our input\n",
    "            # in order to meet our objective stored in \"gradient\"\n",
    "            img_copy += this_res * step\n",
    "            \n",
    "            # Optionally, we could apply any number of regularization\n",
    "            # techniques... Try exploring different ways of regularizing\n",
    "            # gradient. ascent process.  If you are adventurous, you can\n",
    "            # also explore changing the gradient above using a\n",
    "            # total variational loss, as we used in the style net\n",
    "            # implementation during the lecture.  I leave that to you\n",
    "            # as an exercise!\n",
    "\n",
    "            # Crop a 1 pixel border from height and width\n",
    "            img_copy = img_copy[:, crop:-crop, crop:-crop, :]\n",
    "\n",
    "            # Resize (Note: in the lecture, we used scipy's resize which\n",
    "            # could not resize images outside of 0-1 range, and so we had\n",
    "            # to store the image ranges.  This is a much simpler resize\n",
    "            # method that allows us to `preserve_range`.)\n",
    "            img_copy = resize(img_copy[0], (height, width), order=3,\n",
    "                         clip=False, preserve_range=True\n",
    "                         )[np.newaxis].astype(np.float32)\n",
    "\n",
    "            # Plot the image\n",
    "            if (it_i + 1) % plot_step == 0:\n",
    "                m = net['deprocess'](img_copy[0])\n",
    "\n",
    "                plt.grid('off')\n",
    "                plt.imshow(m)\n",
    "                plt.show()\n",
    "                \n",
    "                imgs.append(m)\n",
    "\n",
    "# Create a GIF\n",
    "gif.build_gif(imgs, saveto='fractal.gif')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ipyd.Image(url='fractal.gif?i=2', height=300, width=300)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"guided-hallucinations\"></a>\n",
    "## Guided Hallucinations\n",
    "\n",
    "Instead of following the gradient of an arbitrary mean or max of a particular layer's activation, or a particular object that we want to synthesize, we can also try to guide our image to look like another image.  One way to try this is to take one image, the guide, and find the features at a particular layer or layers.  Then, we take our synthesis image and find the gradient which makes its own layers activations look like the guide image.\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Replace these with your own images!\n",
    "guide_og = plt.imread(...)[..., :3]\n",
    "dream_og = plt.imread(...)[..., :3]\n",
    "\n",
    "assert(guide_og.ndim == 3 and guide_og.shape[-1] == 3)\n",
    "assert(dream_og.ndim == 3 and dream_og.shape[-1] == 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Preprocess both images:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "guide_img = net['preprocess'](guide_og)[np.newaxis]\n",
    "dream_img = net['preprocess'](dream_og)[np.newaxis]\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(7, 4))\n",
    "axs[0].imshow(guide_og)\n",
    "axs[1].imshow(dream_og)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Like w/ Style Net, we are going to measure how similar the features in the guide image are to the dream images.  In order to do that, we'll calculate the dot product.  Experiment with other measures such as l1 or l2 loss to see how this impacts the resulting Dream!\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x = g.get_tensor_by_name(names[0] + \":0\")\n",
    "\n",
    "# Experiment with the weighting\n",
    "feature_loss_weight = 1.0\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    feature_loss = tf.Variable(0.0)\n",
    "    \n",
    "    # Explore different layers/subsets of layers.  This is just an example.\n",
    "    for feature_i in features[3:5]:\n",
    "\n",
    "        # Get the activation of the feature\n",
    "        layer = g.get_tensor_by_name(feature_i)\n",
    "        \n",
    "        # Do the same for our guide image\n",
    "        guide_layer = sess.run(layer, feed_dict={x: guide_img})\n",
    "        \n",
    "        # Now we need to measure how similar they are!\n",
    "        # We'll use the dot product, which requires us to first reshape both\n",
    "        # features to a 2D vector.  But you should experiment with other ways\n",
    "        # of measuring similarity such as l1 or l2 loss.\n",
    "        \n",
    "        # Reshape each layer to 2D vector\n",
    "        layer = tf.reshape(layer, [-1, 1])\n",
    "        guide_layer = guide_layer.reshape(-1, 1)\n",
    "        \n",
    "        # Now calculate their dot product\n",
    "        correlation = tf.matmul(guide_layer.T, layer)\n",
    "        \n",
    "        # And weight the loss by a factor so we can control its influence\n",
    "        feature_loss += feature_loss_weight * correlation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll now use another measure that we saw when developing Style Net during the lecture.  This measure the pixel to pixel difference of neighboring pixels.  What we're doing when we try to optimize a gradient that makes the mean differences small is saying, we want the difference to be low.  This allows us to smooth our image in the same way that we did using the Gaussian to blur the image.  \n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "n_img, height, width, ch = dream_img.shape\n",
    "\n",
    "# We'll weight the overall contribution of the total variational loss\n",
    "# Experiment with this weighting\n",
    "tv_loss_weight = 1.0\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    # Penalize variations in neighboring pixels, enforcing smoothness\n",
    "    dx = tf.square(x[:, :height - 1, :width - 1, :] - x[:, :height - 1, 1:, :])\n",
    "    dy = tf.square(x[:, :height - 1, :width - 1, :] - x[:, 1:, :width - 1, :])\n",
    "    \n",
    "    # We will calculate their difference raised to a power to push smaller\n",
    "    # differences closer to 0 and larger differences higher.\n",
    "    # Experiment w/ the power you raise this to to see how it effects the result\n",
    "    tv_loss = tv_loss_weight * tf.reduce_mean(tf.pow(dx + dy, 1.2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we train just like before, except we'll need to combine our two loss terms, `feature_loss` and `tv_loss` by simply adding them!  The one thing we have to keep in mind is that we want to minimize the `tv_loss` while maximizing the `feature_loss`.  That means we'll need to use the negative `tv_loss` and the positive `feature_loss`.  As an experiment, try just optimizing the `tv_loss` and removing the `feature_loss` from the `tf.gradients` call.  What happens?\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Experiment with the step size!\n",
    "step = 0.1\n",
    "\n",
    "imgs = []\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    # Experiment with just optimizing the tv_loss or negative tv_loss to understand what it is doing!\n",
    "    gradient = tf.gradients(-tv_loss + feature_loss, x)\n",
    "\n",
    "    # Copy the input image as we'll add the gradient to it in a loop\n",
    "    img_copy = dream_img.copy()\n",
    "\n",
    "    with tf.Session(graph=g) as sess, g.device(device):\n",
    "        sess.run(tf.global_variables_initializer())\n",
    "        \n",
    "        for it_i in range(n_iterations):\n",
    "\n",
    "            # This will calculate the gradient of the layer we chose with respect to the input image.\n",
    "            this_res = sess.run(gradient[0], feed_dict={x: img_copy})[0]\n",
    "            \n",
    "            # Let's normalize it by the maximum activation\n",
    "            this_res /= (np.max(np.abs(this_res) + 1e-8))\n",
    "            \n",
    "            # Or alternatively, we can normalize by standard deviation\n",
    "            # this_res /= (np.std(this_res) + 1e-8)\n",
    "\n",
    "            # Then add the gradient back to the input image\n",
    "            # Think about what this gradient represents?\n",
    "            # It says what direction we should move our input\n",
    "            # in order to meet our objective stored in \"gradient\"\n",
    "            img_copy += this_res * step\n",
    "\n",
    "            # Plot the image\n",
    "            if (it_i + 1) % plot_step == 0:\n",
    "                m = net['deprocess'](img_copy[0])\n",
    "\n",
    "                plt.figure(figsize=(5, 5))\n",
    "                plt.grid('off')\n",
    "                plt.imshow(m)\n",
    "                plt.show()\n",
    "                \n",
    "                imgs.append(m)\n",
    "\n",
    "gif.build_gif(imgs, saveto='guided.gif')                "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ipyd.Image(url='guided.gif?i=0', height=300, width=300)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"further-explorations\"></a>\n",
    "## Further Explorations\n",
    "\n",
    "In the `libs` module, I've included a `deepdream` module which has two functions for performing Deep Dream and the Guided Deep Dream.  Feel free to explore these to create your own deep dreams.\n",
    "\n",
    "<a name=\"part-5---style-net\"></a>\n",
    "# Part 5 - Style Net\n",
    "\n",
    "We'll now work on creating our own style net implementation.  We've seen all the steps for how to do this during the lecture, and you can always refer to the [Lecture Transcript](lecture-4.ipynb) if you need to.  I want to you to explore using different networks and different layers in creating your content and style losses.  This is completely unexplored territory so it can be frustrating to find things that work.  Think of this as your empty canvas!  If you are really stuck, you will find a `stylenet` implementation under the `libs` module that you can use instead.\n",
    "\n",
    "Have a look here for inspiration:\n",
    "\n",
    "https://mtyka.github.io/code/2015/10/02/experiments-with-style-transfer.html  \n",
    "\n",
    "http://kylemcdonald.net/stylestudies/\n",
    "\n",
    "<a name=\"network\"></a>\n",
    "## Network\n",
    "\n",
    "Let's reset the graph and load up a network.  I'll include code here for loading up any of our pretrained networks so you can explore each of them!\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "sess.close()\n",
    "tf.reset_default_graph()\n",
    "\n",
    "# Stick w/ VGG for now, and then after you see how\n",
    "# the next few sections work w/ this network, come back\n",
    "# and explore the other networks.\n",
    "\n",
    "net = vgg16.get_vgg_model()\n",
    "# net = vgg16.get_vgg_face_model()\n",
    "# net = inception.get_inception_model(version='v5')\n",
    "# net = inception.get_inception_model(version='v3')\n",
    "# net = i2v.get_i2v_model()\n",
    "# net = i2v.get_i2v_tag_model()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Let's explicity use the CPU, since we don't gain anything using the GPU\n",
    "# when doing Deep Dream (it's only a single image, benefits come w/ many images).\n",
    "device = '/cpu:0'\n",
    "\n",
    "# We'll now explicitly create a graph\n",
    "g = tf.Graph()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now import the graph definition into our newly created Graph using a context manager and specifying that we want to use the CPU."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# And here is a context manager.  We use the python \"with\" notation to create a context\n",
    "# and create a session that only exists within this indent,  as soon as we leave it,\n",
    "# the session is automatically closed!  We also tel the session which graph to use.\n",
    "# We can pass a second context after the comma,\n",
    "# which we'll use to be explicit about using the CPU instead of a GPU.\n",
    "with tf.Session(graph=g) as sess, g.device(device):\n",
    "    \n",
    "    # Now load the graph_def, which defines operations and their values into `g`\n",
    "    tf.import_graph_def(net['graph_def'], name='net')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's then grab the names of every operation in our network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "names = [op.name for op in g.get_operations()]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we need an image for our content image and another one for our style image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "content_og = plt.imread('arles.png')[..., :3]\n",
    "style_og = plt.imread('clinton.png')[..., :3]\n",
    "\n",
    "fig, axs = plt.subplots(1, 2)\n",
    "axs[0].imshow(content_og)\n",
    "axs[0].set_title('Content Image')\n",
    "axs[0].grid('off')\n",
    "axs[1].imshow(style_og)\n",
    "axs[1].set_title('Style Image')\n",
    "axs[1].grid('off')\n",
    "\n",
    "# We'll save these with a specific name to include in your submission\n",
    "plt.imsave(arr=content_og, fname='content.png')\n",
    "plt.imsave(arr=style_og, fname='style.png')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "content_img = net['preprocess'](content_og)[np.newaxis]\n",
    "style_img = net['preprocess'](style_og)[np.newaxis]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see what the network classifies these images as just for fun:\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Grab the tensor defining the input to the network\n",
    "x = ...\n",
    "\n",
    "# And grab the tensor defining the softmax layer of the network\n",
    "softmax = ...\n",
    "\n",
    "# Remember from the lecture that we have to set the dropout\n",
    "# \"keep probability\" to 1.0.\n",
    "keep_probability = np.ones([1, 4096])\n",
    "\n",
    "for img in [content_img, style_img]:\n",
    "    with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "        res = softmax.eval(feed_dict={x: img,\n",
    "                    'net/dropout_1/random_uniform:0': keep_probability,\n",
    "                    'net/dropout/random_uniform:0': keep_probability})[0]\n",
    "        print([(res[idx], net['labels'][idx])\n",
    "               for idx in res.argsort()[-5:][::-1]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"content-features\"></a>\n",
    "## Content Features\n",
    "\n",
    "We're going to need to find the layer or layers we want to use to help us define our \"content loss\".  Recall from the lecture when we used VGG, we used the 4th convolutional layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "print(names)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pick a layer for using for the content features.  If you aren't using VGG remember to get rid of the dropout stuff!\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Experiment w/ different layers here.  You'll need to change this if you \n",
    "# use another network!\n",
    "content_layer = 'net/conv3_2/conv3_2:0'\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    content_features = g.get_tensor_by_name(content_layer).eval(\n",
    "            session=sess,\n",
    "            feed_dict={x: content_img,\n",
    "                    'net/dropout_1/random_uniform:0': keep_probability,\n",
    "                    'net/dropout/random_uniform:0': keep_probability})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"style-features\"></a>\n",
    "## Style Features\n",
    "\n",
    "Let's do the same thing now for the style features.  We'll use more than 1 layer though so we'll append all the features in a list.  If you aren't using VGG remember to get rid of the dropout stuff!\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Experiment with different layers and layer subsets.  You'll need to change these\n",
    "# if you use a different network!\n",
    "style_layers = ['net/conv1_1/conv1_1:0',\n",
    "                'net/conv2_1/conv2_1:0',\n",
    "                'net/conv3_1/conv3_1:0',\n",
    "                'net/conv4_1/conv4_1:0',\n",
    "                'net/conv5_1/conv5_1:0']\n",
    "style_activations = []\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    for style_i in style_layers:\n",
    "        style_activation_i = g.get_tensor_by_name(style_i).eval(\n",
    "            feed_dict={x: style_img,\n",
    "                    'net/dropout_1/random_uniform:0': keep_probability,\n",
    "                    'net/dropout/random_uniform:0': keep_probability})\n",
    "        style_activations.append(style_activation_i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we find the gram matrix which we'll use to optimize our features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "style_features = []\n",
    "for style_activation_i in style_activations:\n",
    "    s_i = np.reshape(style_activation_i, [-1, style_activation_i.shape[-1]])\n",
    "    gram_matrix = np.matmul(s_i.T, s_i) / s_i.size\n",
    "    style_features.append(gram_matrix.astype(np.float32))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"remapping-the-input\"></a>\n",
    "## Remapping the Input\n",
    "\n",
    "We're almost done building our network.  We just have to change the input to the network to become \"trainable\".  Instead of a placeholder, we'll have a `tf.Variable`, which allows it to be trained.  We could set this to the content image, another image entirely, or an image of noise.  Experiment with all three options!\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "g = tf.Graph()\n",
    "\n",
    "# Get the network again\n",
    "net = vgg16.get_vgg_model()\n",
    "\n",
    "# Load up a session which we'll use to import the graph into.\n",
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    # We can set the `net_input` to our content image\n",
    "    # or perhaps another image\n",
    "    # or an image of noise\n",
    "#     net_input = tf.Variable(content_img / 255.0)\n",
    "    net_input = tf.get_variable(\n",
    "       name='input',\n",
    "       shape=content_img.shape,\n",
    "       dtype=tf.float32,\n",
    "       initializer=tf.random_normal_initializer(\n",
    "           mean=np.mean(content_img), stddev=np.std(content_img)))\n",
    "    \n",
    "    # Now we load the network again, but this time replacing our placeholder\n",
    "    # with the trainable tf.Variable\n",
    "    tf.import_graph_def(\n",
    "        net['graph_def'],\n",
    "        name='net',\n",
    "        input_map={'images:0': net_input})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"content-loss\"></a>\n",
    "## Content Loss\n",
    "\n",
    "In the lecture we saw that we'll simply find the l2 loss between our content layer features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    content_loss = tf.nn.l2_loss((g.get_tensor_by_name(content_layer) -\n",
    "                                 content_features) /\n",
    "                                 content_features.size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"style-loss\"></a>\n",
    "## Style Loss\n",
    "\n",
    "Instead of straight l2 loss on the raw feature activations, we're going to calculate the gram matrix and find the loss between these.  Intuitively, this is finding what is common across all convolution filters, and trying to enforce the commonality between the synthesis and style image's gram matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    style_loss = np.float32(0.0)\n",
    "    for style_layer_i, style_gram_i in zip(style_layers, style_features):\n",
    "        layer_i = g.get_tensor_by_name(style_layer_i)\n",
    "        layer_shape = layer_i.get_shape().as_list()\n",
    "        layer_size = layer_shape[1] * layer_shape[2] * layer_shape[3]\n",
    "        layer_flat = tf.reshape(layer_i, [-1, layer_shape[3]])\n",
    "        gram_matrix = tf.matmul(tf.transpose(layer_flat), layer_flat) / layer_size\n",
    "        style_loss = tf.add(style_loss, tf.nn.l2_loss((gram_matrix - style_gram_i) / np.float32(style_gram_i.size)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"total-variation-loss\"></a>\n",
    "## Total Variation Loss\n",
    "\n",
    "And just like w/ guided hallucinations, we'll try to enforce some smoothness using a total variation loss."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def total_variation_loss(x):\n",
    "    h, w = x.get_shape().as_list()[1], x.get_shape().as_list()[1]\n",
    "    dx = tf.square(x[:, :h-1, :w-1, :] - x[:, :h-1, 1:, :])\n",
    "    dy = tf.square(x[:, :h-1, :w-1, :] - x[:, 1:, :w-1, :])\n",
    "    return tf.reduce_sum(tf.pow(dx + dy, 1.25))\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    tv_loss = total_variation_loss(net_input)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"training\"></a>\n",
    "## Training\n",
    "\n",
    "We're almost ready to train!  Let's just combine our three loss measures and stick it in an optimizer.\n",
    "\n",
    "<h3><font color='red'>TODO! COMPLETE THIS SECTION!</font></h3>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    # Experiment w/ the weighting of these!  They produce WILDLY different\n",
    "    # results.\n",
    "    loss = 5.0 * content_loss + 1.0 * style_loss + 0.001 * tv_loss\n",
    "    optimizer = tf.train.AdamOptimizer(0.05).minimize(loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And now iterate!  Feel free to play with the number of iterations or how often you save an image.  If you use a different network to VGG, then you will not need to feed in the dropout parameters like I've done here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "imgs = []\n",
    "n_iterations = 100\n",
    "\n",
    "with tf.Session(graph=g) as sess, g.device('/cpu:0'):\n",
    "    sess.run(tf.global_variables_initializer())\n",
    "\n",
    "    # map input to noise\n",
    "    og_img = net_input.eval()\n",
    "    \n",
    "    for it_i in range(n_iterations):\n",
    "        _, this_loss, synth = sess.run([optimizer, loss, net_input], feed_dict={\n",
    "                    'net/dropout_1/random_uniform:0': keep_probability,\n",
    "                    'net/dropout/random_uniform:0': keep_probability})\n",
    "        print(\"%d: %f, (%f - %f)\" %\n",
    "            (it_i, this_loss, np.min(synth), np.max(synth)))\n",
    "        if it_i % 5 == 0:\n",
    "            m = vgg16.deprocess(synth[0])\n",
    "            imgs.append(m)\n",
    "            plt.imshow(m)\n",
    "            plt.show()\n",
    "    gif.build_gif(imgs, saveto='stylenet.gif')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ipyd.Image(url='stylenet.gif?i=0', height=300, width=300)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment-submission\"></a>\n",
    "# Assignment Submission\n",
    "\n",
    "After you've completed the notebook, create a zip file of the current directory using the code below.  This code will make sure you have included this completed ipython notebook and the following files named exactly as:\n",
    "\n",
    "<pre>\n",
    "    session-4/\n",
    "      session-4.ipynb\n",
    "      softmax.gif\n",
    "      fractal.gif\n",
    "      guided.gif\n",
    "      content.png\n",
    "      style.png\n",
    "      stylenet.gif\n",
    "</pre>\n",
    "\n",
    "You'll then submit this zip file for your third assignment on Kadenze for \"Assignment 4: Deep Dream and Style Net\"!  Remember to complete the rest of the assignment, gallery commenting on your peers work, to receive full credit!  If you have any questions, remember to reach out on the forums and connect with your peers or with me.\n",
    "\n",
    "To get assessed, you'll need to be a premium student!  This will allow you to build an online portfolio of all of your work and receive grades.  If you aren't already enrolled as a student, register now at http://www.kadenze.com/ and join the [#CADL](https://twitter.com/hashtag/CADL) community to see what your peers are doing! https://www.kadenze.com/courses/creative-applications-of-deep-learning-with-tensorflow/info\n",
    "\n",
    "Also, if you share any of the GIFs on Facebook/Twitter/Instagram/etc..., be sure to use the #CADL hashtag so that other students can find your work!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "utils.build_submission('session-4.zip',\n",
    "                       ('softmax.gif',\n",
    "                        'fractal.gif',\n",
    "                        'guided.gif',\n",
    "                        'content.png',\n",
    "                        'style.png',\n",
    "                        'stylenet.gif',\n",
    "                        'session-4.ipynb'))"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
